ABSTRACT

The halo mass function from N-body simulations of collisionless matter is generally 
used to retrieve cosmological parameters from observed counts of galaxy clusters. This 
neglects the observational fact that the baryonic mass fraction in clusters is a random 
variable that, on average, increases with the total mass. Considering a mock catalog 
that includes tens of thousands of galaxy clusters, as expected from the forthcoming 
generation of surveys, we show that the effect of a varying baryonic mass fraction will 
be observable with high statistical significance. The net effect is a change in the overall 
normalization of the cluster mass function and a milder modification of its shape. Our 
results indicate the absolute necessity of taking into account baryonic corrections to the 
mass function if one wants to obtain unbiased estimates of the cosmological parameters 
from data of this quality. We introduce the formalism necessary to accomplish this 
goal. Our discussion is based on the conditional probability of finding a given value 
of the baryonic mass fraction for clusters of fixed total mass. Finally, we show that 
combining information from the cluster counts with measurements of the baryonic 
mass fraction in a small subsample of clusters (including only a few tens of objects) 
will nearly optimally constrain the cosmological parameters. 

Key words: cosmology: cosmological parameters, large-scale structure of universe, 
theory, clusters: general, methods: statistical 



1 INTRODUCTION 



Galaxy clusters are powerful cosmological probes (e.g. Vikhlinin et al. 2009;
Cunha et al. 2009; Mantz et al. 2010; Rapetti et al. 2012). They are expected
to be associated
with massive haloes of dark matter. Therefore, their ob- 
served number density is generally compared with theoret- 
ical models for the halo abundance which are calibrated 
against large N-body simulations of collisionless matter. 

The key theoretical quantity is the halo mass function $n_{\rm h}(M, z)$, which
is defined so that $n_{\rm h}(M, z)\,dM$ gives the number of dark-matter haloes
(at redshift $z$) with mass between $M$ and $M + dM$ per unit comoving volume.
The calibration of the halo mass function has been pushed to a precision of
$\sim 5$ per cent (e.g. Jenkins et al. 2001; Warren et al. 2006; Tinker et al.
2008; Pillepich et al. 2010) and can be improved even further (e.g. Reed et al.
2012). However, this precision is somewhat illusory for direct applications to
galaxy clusters. In fact, numerical simulations that, more realistically, also
consider a baryonic component on top of the dark matter (e.g. Stanek et al.
2009; McCarthy et al. 2010; Cui et al. 2012; van Daalen et al. 2011; Rasia et
al. 2012) have shown that gas physics can alter cluster masses by $\sim 10$ per
cent (Cui et al. 2012) with respect to pure N-body simulations. Consequently,
the cluster abundance at fixed mass changes by $\sim 10$-$20$ per cent (Stanek
et al. 2009) while the corresponding 2-point clustering amplitude varies by
$\sim 10$ per cent (Rudd et al. 2007; van Daalen et al. 2011). Due to the large
uncertainties in modeling baryon physics (i.e. radiative cooling, star
formation, and different feedback mechanisms in the presence of stars and
active galactic nuclei), current gas simulations can only provide
order-of-magnitude approximations of these effects.



State-of-the-art estimates of the cosmological parameters from cluster
abundances account for the effects of baryons by artificially enlarging the
uncertainties in the parameters of the halo mass function extracted from N-body
simulations and marginalizing over them (e.g. Mantz et al. 2010). This is a
statistical trick to account for possible differences between the shapes of the
halo and cluster mass functions. Such an approach is adequate for the current
samples of clusters, which probe relatively small volumes and have large error
bars (e.g. Balaguera-Antolínez et al. 2012).

Forthcoming cluster surveys will probe much larger vol- 
umes with higher sensitivities. In this Letter, we show that, 
in order to use the new data to derive cosmological con- 
straints that are not only precise (i.e. small statistical er- 
rors) but also accurate (i.e. unbiased estimates), it will be 
imperative to model the baryonic effects on the cluster mass 
function. Following a statistical approach, we describe the 
effects of gas physics in terms of the mean baryonic mass 
fraction as a function of cluster mass, $f_{\rm b}(M)$, and of the
corresponding dispersion around the mean, $\sigma_{\rm b}$. We focus our
attention on two variables: the matter density parameter, $\Omega_{\rm m}$, and
the linear rms matter fluctuation within a spherical top-hat window of radius
$8\,h^{-1}$ Mpc, $\sigma_8$. Considering observationally motivated guesses for
the shape and amplitude of the function $f_{\rm b}(M)$ and of the scatter
$\sigma_{\rm b}$, we first quantify the bias in the estimates for
$\Omega_{\rm m}$ and $\sigma_8$ caused by using the halo mass function as a
proxy for the cluster mass function. Subsequently, we explore the option of
using extra free parameters in the models to simultaneously determine from the
cluster counts both the cosmological parameters and the relation
$f_{\rm b}(M)$. We find that this procedure increases the uncertainties on
$\Omega_{\rm m}$ and $\sigma_8$. Finally, we show that considering additional
information on the baryonic mass fraction from follow-up studies of a small
subsample of clusters is pivotal to recover unbiased estimates for
$\Omega_{\rm m}$ and $\sigma_8$ with small uncertainties. We conclude that
taking into account the effect of baryons on the cluster mass function is key
to fully exploit the potential of forthcoming large-volume cluster surveys as
cosmological probes.

We adopt a fiducial cosmological model based on a flat $\Lambda$CDM Universe
with a matter density parameter of $\Omega_{\rm m} = 0.258$, a baryon density
parameter of $\Omega_{\rm b} = 0.044$, a dimensionless Hubble parameter
$h = 0.735$ (in units of 100 km s$^{-1}$ Mpc$^{-1}$), a linear rms mass
fluctuation within $8\,h^{-1}$ Mpc of $\sigma_8 = 0.773$ and a scalar spectral
index $n_{\rm s} = 0.954$. We use the transfer function of Eisenstein & Hu
(1998) to compute the linear matter power spectrum. Cluster and halo masses are
defined within spherical overdensities of $\Delta = 500$ times the critical
density of the Universe. The halo mass function is calculated using the fitting
formulae of Tinker et al. (2008).
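
As a quick numerical companion to the fiducial model, the sketch below evaluates the cosmic baryon fraction $\Omega_{\rm b}/\Omega_{\rm m}$ and the radius enclosing a given mass at $\Delta = 500$ times the critical density. The function names are illustrative only, the critical density today is taken as the standard value $2.775 \times 10^{11}\,h^2\,M_\odot\,{\rm Mpc}^{-3}$, and radiation is neglected in $E(z)$; none of this code comes from the Letter.

```python
import math

# Fiducial flat LambdaCDM parameters quoted in the text.
OMEGA_M = 0.258
OMEGA_B = 0.044
DELTA = 500.0  # overdensity with respect to the critical density

# Critical density today, 3 H0^2 / (8 pi G), in h^2 M_sun Mpc^-3,
# i.e. in (h^-1 M_sun) per (h^-1 Mpc)^3.
RHO_CRIT0 = 2.775e11


def cosmic_baryon_fraction(omega_b=OMEGA_B, omega_m=OMEGA_M):
    """Cosmic baryon fraction, Omega_b / Omega_m."""
    return omega_b / omega_m


def e2(z, omega_m=OMEGA_M):
    """E(z)^2 for a flat LambdaCDM model (radiation neglected)."""
    return omega_m * (1.0 + z) ** 3 + (1.0 - omega_m)


def r_delta(mass, z=0.0, delta=DELTA):
    """Physical radius (h^-1 Mpc) enclosing `mass` (h^-1 M_sun)
    at a mean density of delta times the critical density at z."""
    rho_crit = RHO_CRIT0 * e2(z)
    return (3.0 * mass / (4.0 * math.pi * delta * rho_crit)) ** (1.0 / 3.0)


if __name__ == "__main__":
    print("cosmic baryon fraction:", round(cosmic_baryon_fraction(), 4))
    print("R_500 for 1e14 h^-1 M_sun at z=0:", round(r_delta(1.0e14), 3), "h^-1 Mpc")
```

For the fiducial values the cosmic baryon fraction is about 0.17, which is the reference value against which the cluster baryon fraction is compared in the next section.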



2 THE OBSERVED MASS FUNCTION 

Dark-matter particles in N-body simulations are assigned masses proportional to
the total matter content of the Universe, i.e.
$m_{\rm p} \propto \Omega_{\rm m} = \Omega_{\rm b} + \Omega_{\rm dm}$ (in terms
of the baryonic and dark-matter components). Thus, by construction, the mass of
simulated dark-matter haloes (identified with some practical algorithm)
includes a baryonic fraction which coincides with the cosmic baryon fraction
$f_{\rm c} = \Omega_{\rm b}/\Omega_{\rm m}$. However, observational studies
(e.g. Giodini et al. 2009; Lin et al. 2012) and hydro simulations (e.g. Stanek
et al. 2009; McCarthy et al. 2010; Cui et al. 2012) have shown that the baryon
fraction in galaxy clusters lies below $f_{\rm c}$ and varies with the cluster
mass. This is a consequence of the complex baryonic physics taking place in
galaxy clusters and
its dependence on the depth of the gravitational potential
well (see e.g. Sarazin 198?).

Figure 1. (a) Ratio between the cluster mass function $n$ and a model assuming
that the baryonic mass fraction is always equal to the cosmic value (i.e.
$M_{\rm tot} = M_{\rm nb}$), $n_0$. Line styles correspond to different values
of the parameter $A$ in Eq. (2) as indicated in the plot. The parameters $B$
and $\sigma_{\rm b}$ are kept fixed at their fiducial values. (b)
Signal-to-noise ratio for the difference $n - n_0$ as a function of the
observed cluster mass in 20 equispaced log-bins. A survey volume of
$0.8\,(h^{-1}\,{\rm Gpc})^3$ centered at $z = 0.1$ is assumed.



Let us define a cluster as the spherical region centered on a density peak and
enclosing a certain overdensity $\Delta$ times the critical density of the
universe. Each cluster will be characterized by a well defined total mass
$M_{\rm tot}$ (such that $M_{\rm tot} = M_{\rm b} + M_{\rm dm}$) and a baryon
fraction $f_{\rm b} = M_{\rm b}/M_{\rm tot}$. Therefore, for a given
dark-matter mass, $M_{\rm dm} = (1 - f_{\rm b})\,M_{\rm tot}$, clusters in
N-body simulations have total masses,
$M_{\rm nb} = M_{\rm dm}/(1 - f_{\rm c})$, which are overestimates of the
actual cluster mass

$$M_{\rm tot} = \frac{1 - f_{\rm c}}{1 - f_{\rm b}}\, M_{\rm nb}. \qquad (1)$$



The baryon fraction $f_{\rm b}$ is a stochastic quantity that varies from
cluster to cluster. Based on observational studies (Giodini et al. 2009; Lin et
al. 2012; Sun 2012), we assume that the conditional probability density
$P(f_{\rm b}|M_{\rm tot})$ is well approximated by a Gaussian distribution with
mean

$$\langle f_{\rm b}|M_{\rm tot}\rangle = A\, f_{\rm c}
\left(\frac{M_{\rm tot}}{2 \times 10^{14}\,h^{-1} M_\odot}\right)^{B} \qquad (2)$$

and scatter $\sigma_{\rm b}$. We adopt the results by Giodini et al. (2009),
$A = 0.72 \pm 0.01$, $B = 0.09 \pm 0.03$ and
$\sigma_{\rm b} \simeq 2 \times 10^{-3}$, as fiducial values, but it is worth
stressing that other studies favour slightly different values for these
parameters.
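
To make Eqs. (1) and (2) concrete, the following minimal sketch evaluates the mean baryon fraction at a given total mass and the corresponding ratio $M_{\rm tot}/M_{\rm nb}$ for the fiducial values of $A$ and $B$. The function names are hypothetical and the scatter $\sigma_{\rm b}$ is ignored here; it is only meant to show the size of the mean mass offset.

```python
# Fiducial inputs taken from the text.
F_C = 0.044 / 0.258   # cosmic baryon fraction, Omega_b / Omega_m
A, B = 0.72, 0.09     # fiducial parameters of Eq. (2)
M_PIVOT = 2.0e14      # pivot mass of Eq. (2) in h^-1 M_sun


def mean_fb(m_tot, a=A, b=B, f_c=F_C):
    """Mean baryon fraction at fixed total mass, Eq. (2)."""
    return a * f_c * (m_tot / M_PIVOT) ** b


def mass_ratio(f_b, f_c=F_C):
    """M_tot / M_nb from Eq. (1) for a cluster with baryon fraction f_b."""
    return (1.0 - f_c) / (1.0 - f_b)


if __name__ == "__main__":
    for m_tot in (1.0e14, 2.0e14, 1.0e15):
        f_b = mean_fb(m_tot)
        print(f"M_tot = {m_tot:.1e}: <f_b> = {f_b:.4f}, M_tot/M_nb = {mass_ratio(f_b):.4f}")
```

At the pivot mass the mean baryon fraction is about 0.123, so the N-body mass exceeds the actual total mass by roughly 5 to 6 per cent; the offset shrinks slowly with increasing mass because $B > 0$.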

The conditional probability density $P(M_{\rm tot}|M_{\rm nb})$ can be
determined by generating Monte Carlo realizations of the pair
$(f_{\rm b}, M_{\rm tot})$ as follows. For a given $M_{\rm nb}$, we first draw
a pseudorandom number $\Delta f_{\rm b}$ from a Gaussian distribution with zero
mean and rms value $\sigma_{\rm b}$. We then insert the relation
$f_{\rm b} = \langle f_{\rm b}|M_{\rm tot}\rangle + \Delta f_{\rm b}$ into
Eq. (1) and solve the resulting equation for $M_{\rm tot}$ by iteration
starting from the initial

Case   Model   Notes   Data                          Description
I      $n_0$   -       $n_{\rm obs}$                 The halo mass function from N-body simulations is used to fit the observed mass function
II     $n_0$   + cov   $n_{\rm obs}$                 As in I but after artificially inflating the covariance matrix of the $n_{\rm h}$ parameters
III    $n$     -       $n_{\rm obs}$                 Accounting for a varying baryon fraction and simultaneously fitting $P(f_{\rm b}|M_{\rm tot})$
IV     $n$     -       $n_{\rm obs}$, $f_{\rm b}$    As in III combining the mass function data with 30 measurements of $f_{\rm b}$
V      $n$     -       $n_{\rm obs}$                 As in III but assuming perfect a priori knowledge of $P(f_{\rm b}|M_{\rm tot})$
I_G    $n_0$   -       $n_{\rm obs}$                 As in I marginalizing over the mass-measurement error with a Gaussian prior (see text)
I_F    $n_0$   -       $n_{\rm obs}$                 As in I marginalizing over the mass-measurement error with a broad flat prior (see text)
IV_G   $n$     -       $n_{\rm obs}$, $f_{\rm b}$    As in IV marginalizing over the mass-measurement error with a Gaussian prior (see text)
IV_F   $n$     -       $n_{\rm obs}$, $f_{\rm b}$    As in IV marginalizing over the mass-measurement error with a broad flat prior (see text)

Table 1. Summary of the Bayesian-inference cases discussed in this Letter. Note
that $n_0$ is obtained assuming that all clusters have
$M_{\rm tot} = M_{\rm nb}$ (i.e. $f_{\rm b} = f_{\rm c}$) in Eq. (4), while $n$
takes into account that the baryonic mass fraction varies from object to
object.



guess $M_{\rm tot} = M_{\rm nb}$. We find that $P(M_{\rm tot}|M_{\rm nb})$ is
well approximated by a log-normal distribution with log-scatter$^1$
$\sigma_{\ln M_{\rm tot}|M_{\rm nb}}$ which slightly increases with
$M_{\rm nb}$. In our analysis we have directly used the value of this intrinsic
scatter derived from the Monte Carlo realizations. However, for illustration,
it is worth mentioning that the scatter can be well described as
$\sigma_{\ln M_{\rm tot}|M_{\rm nb}} \simeq 8.5 \times 10^{-4}\,(1 + 0.12 \log_{10} M_{\rm nb})$
for our fiducial values of the parameters $A$, $B$ and $\sigma_{\rm b}$.
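
The Monte Carlo procedure described above can be sketched in a few lines. This is an illustrative reimplementation under the stated fiducial values, not the authors' code: the function names, the number of draws, the seed and the convergence tolerance are all assumptions. It draws $\Delta f_{\rm b}$, solves Eq. (1) for $M_{\rm tot}$ by fixed-point iteration starting from $M_{\rm tot} = M_{\rm nb}$, and compares the resulting rms of $\ln M_{\rm tot}$ with the fitting formula quoted in the text.

```python
import math
import random
import statistics

F_C = 0.044 / 0.258                 # cosmic baryon fraction
A, B, SIGMA_B = 0.72, 0.09, 2.0e-3  # fiducial values used in the text
M_PIVOT = 2.0e14                    # h^-1 M_sun


def mean_fb(m_tot):
    """Mean baryon fraction at fixed total mass, Eq. (2)."""
    return A * F_C * (m_tot / M_PIVOT) ** B


def solve_m_tot(m_nb, delta_fb, tol=1e-12, max_iter=200):
    """Solve Eq. (1) with f_b = <f_b|M_tot> + delta_fb by fixed-point iteration."""
    m_tot = m_nb  # initial guess, as in the text
    for _ in range(max_iter):
        f_b = mean_fb(m_tot) + delta_fb
        m_new = (1.0 - F_C) / (1.0 - f_b) * m_nb
        if abs(m_new - m_tot) <= tol * m_tot:
            return m_new
        m_tot = m_new
    return m_tot


def log_scatter(m_nb, n_draws=20000, seed=1):
    """Monte Carlo estimate of the rms of ln M_tot at fixed M_nb."""
    rng = random.Random(seed)
    ln_m = [math.log(solve_m_tot(m_nb, rng.gauss(0.0, SIGMA_B)))
            for _ in range(n_draws)]
    return statistics.pstdev(ln_m)


def fitted_scatter(m_nb):
    """Fitting formula quoted in the text (M_nb in h^-1 M_sun)."""
    return 8.5e-4 * (1.0 + 0.12 * math.log10(m_nb))


if __name__ == "__main__":
    for m_nb in (1.0e14, 1.0e15):
        print(f"{m_nb:.0e}: MC = {log_scatter(m_nb):.3e}, fit = {fitted_scatter(m_nb):.3e}")
```

The iteration converges in a handful of steps because the mean relation depends only weakly on mass ($B = 0.09$), so each update changes $M_{\rm tot}$ by about one per cent of the previous correction.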

Cluster masses are not observed directly and need to be inferred from
observational proxies (e.g. optical richness, X-ray temperature or flux,
Sunyaev-Zel'dovich signal, lensing shear). We assume that the observed mass,
$M_{\rm obs}$, is an unbiased estimate of the total mass, with a log-normal
probability density function of the residuals (e.g. Lima & Hu 2004; Cunha &
Evrard 2009), for which we use a constant log-scatter
$\sigma_{\ln M_{\rm obs}|M_{\rm tot}} = 0.1$. We can then write the mass
function of galaxy clusters in terms of their observed mass as

$$n(M_{\rm obs}, z) = \int_0^\infty n_{\rm h}(M_{\rm nb}, z)\,
P(M_{\rm obs}|M_{\rm nb})\, dM_{\rm nb}, \qquad (3)$$

where

$$P(M_{\rm obs}|M_{\rm nb}) = \int_0^\infty P(M_{\rm obs}|M_{\rm tot})\,
P(M_{\rm tot}|M_{\rm nb})\, dM_{\rm tot}. \qquad (4)$$

Based on our assumptions listed above, $P(M_{\rm obs}|M_{\rm nb})$ can be
written as a log-normal distribution with log-scatter
$[\sigma^2_{\ln M_{\rm obs}|M_{\rm tot}} + \sigma^2_{\ln M_{\rm tot}|M_{\rm nb}}]^{1/2}$.
Note that the main effect of the variable baryon fraction is to introduce a
systematic mass-dependent offset between $M_{\rm obs}$ and $M_{\rm dm}$ while
the correction to the intrinsic scatter plays a sub-dominant role, at least for
mass proxies with broad distributions.
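
The convolution in Eq. (3) is easy to check numerically on a toy input. The sketch below is an assumption-laden illustration: it uses a hypothetical power-law mass function in place of the Tinker et al. (2008) fit, takes $\ln M_{\rm obs}$ to be Gaussian around a median supplied by the caller (by default $\ln M_{\rm nb}$, i.e. no baryonic offset), and integrates with a simple trapezoidal rule in $\ln M_{\rm nb}$. For a power law $n_{\rm h} \propto M^{-\gamma}$ and no offset, the convolution has the closed form $n(M_{\rm obs}) = n_{\rm h}(M_{\rm obs}) \exp[(\gamma - 1)^2 \sigma^2 / 2]$, which serves as a test.

```python
import math


def combined_log_scatter(sigma_obs, sigma_int):
    """Quadrature sum of the measurement and intrinsic log-scatters."""
    return math.hypot(sigma_obs, sigma_int)


def lognormal_pdf(m_obs, ln_median, sigma):
    """Density in m_obs when ln(m_obs) is Gaussian with mean ln_median and rms sigma."""
    z = (math.log(m_obs) - ln_median) / sigma
    return math.exp(-0.5 * z * z) / (m_obs * sigma * math.sqrt(2.0 * math.pi))


def observed_mass_function(n_h, m_obs, sigma, ln_median=math.log,
                           n_steps=2000, width=8.0):
    """Trapezoidal evaluation of Eq. (3) over ln M_nb.

    n_h       : callable giving dn/dM at mass M_nb (a toy function here)
    ln_median : callable mapping M_nb to the median of ln M_obs; the default
                ignores any baryonic offset (assumption of this sketch)
    """
    lo = math.log(m_obs) - width * sigma
    dx = 2.0 * width * sigma / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        m_nb = math.exp(lo + i * dx)
        weight = 0.5 if i in (0, n_steps) else 1.0
        # The extra factor m_nb converts dM_nb into M_nb dln(M_nb).
        total += weight * n_h(m_nb) * lognormal_pdf(m_obs, ln_median(m_nb), sigma) * m_nb
    return total * dx


def toy_power_law(m, gamma=3.0):
    """Hypothetical mass function, dn/dM proportional to M^-gamma."""
    return (m / 1.0e14) ** (-gamma)


if __name__ == "__main__":
    sigma = combined_log_scatter(0.1, 2.3e-3)
    boost = observed_mass_function(toy_power_law, 1.0e14, sigma) / toy_power_law(1.0e14)
    print("total log-scatter:", round(sigma, 5), " abundance boost:", round(boost, 4))
```

The first function also shows why the intrinsic scatter is sub-dominant: adding a term of order $2 \times 10^{-3}$ in quadrature to 0.1 changes the total log-scatter by less than one part in a thousand.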

In panel (a) of Fig. 1 we compare the cluster mass function,
$n(M_{\rm obs})$, with its counterpart obtained by neglecting the effects of
the varying baryon fraction (i.e. assuming that $M_{\rm tot} = M_{\rm nb}$),
which we dub $n_0(M_{\rm obs})$ (note that $n_0$ differs from $n_{\rm h}$ due
to the mass-measurement errors). Each curve corresponds to a different value of
$A$ in Eq. (2) while $B$ and $\sigma_{\rm b}$ are kept fixed at the fiducial
values indicated above. Two main effects are visible: an overall reduction of
the mass-function normalization (by $\sim 10$-$15$ per cent) and a more severe
suppression of the abundance of massive clusters with



$^1$ In this Letter, logarithms of the cluster mass to base 10 and e
are always taken after measuring the mass in units of $h^{-1} M_\odot$.



respect to the less massive ones. Both effects are enhanced 
for lower values of A. 

Will the discrepancy between the actual cluster mass function,
$n(M_{\rm obs})$, and the predictions of N-body simulations (convolved with the
mass-measurement error), $n_0(M_{\rm obs})$, be noticeable with future
observational campaigns? As a prototype for the forthcoming cluster surveys, we
consider a catalog spanning a comoving volume of $0.8\,(h^{-1}\,{\rm Gpc})^3$
(corresponding to a survey covering the full sky down to $z < 0.2$ or half of
the sky to $z < 0.25$) and containing $\sim 2.79 \times 10^4$ entries in the
mass range $10^{13.5} < M_{\rm obs}/(h^{-1} M_\odot) < 10^{15}$ (for
$A = 0.72$) with mean redshift $z = 0.1$. We assume full completeness in the
mass range we are probing and compute the cluster mass function in 20 mass bins
of width $\Delta \log_{10} M_{\rm obs} = 0.075$. Assuming Poisson errorbars of
size $\sigma$, in panel (b) of Fig. 1 we show the signal-to-noise ratio for the
difference $n(M_{\rm obs}) - n_0(M_{\rm obs})$ as a function of the cluster
mass. Highly statistically significant deviations are detectable for
$M_{\rm obs} < 10^{14.5}\,h^{-1} M_\odot$.
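
The binning and the per-bin signal-to-noise ratio can be written down compactly. In the sketch below the bin layout follows the text (20 bins of width 0.075 dex starting at $10^{13.5}\,h^{-1} M_\odot$), while the function names and the choice of taking the Poisson error as the square root of the observed counts in each bin are assumptions of this illustration.

```python
import math


def log_mass_bin_edges(log_min=13.5, n_bins=20, width=0.075):
    """Edges of the equispaced bins in log10(M_obs / h^-1 M_sun)."""
    return [log_min + i * width for i in range(n_bins + 1)]


def poisson_signal_to_noise(counts, counts_ref):
    """Per-bin (N - N0) / sqrt(N), with sqrt(N) the Poisson error on the
    observed counts N and N0 the counts predicted by the reference model."""
    return [(n - n0) / math.sqrt(n) if n > 0 else 0.0
            for n, n0 in zip(counts, counts_ref)]


if __name__ == "__main__":
    edges = log_mass_bin_edges()
    print("bins span", edges[0], "to", round(edges[-1], 3))
    # Hypothetical counts: a 10 per cent deficit in a bin of ~1000 objects.
    print(poisson_signal_to_noise([900], [1000]))
```

A 10 per cent deficit in a bin holding about a thousand clusters already corresponds to a deviation of more than three standard deviations, which illustrates why a normalization shift of 10 to 15 per cent is highly significant for a catalog of this size.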



3 COSMOLOGICAL PARAMETERS 

We want to quantify the bias and the uncertainty in the measurement of the
cosmological parameters $\Omega_{\rm m}$ and $\sigma_8$ obtained by fitting the
observed cluster mass function, $n_{\rm obs}$, with different models. In order
to do this, we use the mock cluster catalog described in the previous section
and we sample the posterior distribution of the model parameters with a Markov
Chain Monte Carlo algorithm. We use different models and combinations of data
which are briefly summarized in Table 1 and extensively described below. The
(marginalized) posterior mean and rms values for $\Omega_{\rm m}$ and
$\sigma_8$ are given in Table 2 together with the corresponding "figure of
merit" (FoM, defined as the inverse of the area of the joint 68.3 per cent
credibility region in the $\{\Omega_{\rm m}, \sigma_8\}$ plane). The joint 95.4
per cent credibility regions are instead shown in panel (a) of Fig. 2.
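
For intuition about the figure of merit, the following sketch evaluates the area of a joint credibility region under the additional assumption that the posterior is a bivariate Gaussian; in the Letter the region is measured from the Markov chains, so this is only an approximation with hypothetical function names. For a two-dimensional Gaussian the region enclosing a probability $p$ is the ellipse $\Delta\chi^2 \le -2\ln(1 - p)$, whose area is $\pi\,\Delta\chi^2\,\sqrt{\det C}$.

```python
import math


def credibility_area(sigma_x, sigma_y, rho, level=0.683):
    """Area of the joint credibility ellipse of a 2D Gaussian posterior
    with marginal rms values sigma_x, sigma_y and correlation rho."""
    delta_chi2 = -2.0 * math.log(1.0 - level)  # chi-square quantile, 2 dof
    sqrt_det_cov = sigma_x * sigma_y * math.sqrt(1.0 - rho * rho)
    return math.pi * delta_chi2 * sqrt_det_cov


def figure_of_merit(sigma_x, sigma_y, rho, level=0.683):
    """Inverse of the area of the joint credibility region."""
    return 1.0 / credibility_area(sigma_x, sigma_y, rho, level)


if __name__ == "__main__":
    # Marginal errors as quoted for case I; the correlation is a guess.
    print(f"FoM = {figure_of_merit(0.002, 0.007, 0.91):.2e}")
```

Because the area scales with $\sqrt{1 - \rho^2}$, a strongly degenerate pair of parameters can have a large FoM even when the marginal errors are modest. Under this Gaussian approximation, marginal errors of 0.002 and 0.007 reproduce a FoM of about $2.4 \times 10^4$ only for a correlation coefficient of magnitude close to 0.9, which is an inference of this sketch rather than a number given in the Letter.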

First, we consider the mass function extracted from N-body simulations with no
corrections for a varying $f_{\rm b}$ (case I). Specifically, we use the
function $n_0(M_{\rm obs})$ to fit the observed mass distribution. The
resulting estimate for $\Omega_{\rm m}$ is significantly biased low. To first
order, this is because the normalization of the function $n_0$ scales
proportionally to $\Omega_{\rm m}$ and, as shown in panel (a) of Fig. 1, the
effect of the varying baryon fraction is to reduce the overall normalization of
the observed cluster counts. On the other hand, $\sigma_8$ is slightly biased
high, as expected from the location of the exponential cutoff in the mass
function. However, the bias is not very significant given the corresponding
statistical uncertainty. The size and sign of the bias on $\sigma_8$ markedly
depend on the fiducial values adopted in Eq. (2), while this is not true for
$\Omega_{\rm m}$, which is always biased low for $A < 1$.

To gain freedom in the shape of the theoretical mass function and reduce
systematic effects when fitting current data, it is common practice to let the
parameters that define $n_{\rm h}$ vary within some predefined range (e.g.
Mantz et al. 2010). We have investigated what happens when this technique is
applied to our mock sample. In this case we have allowed the parameters of the
mass function to vary within a four-dimensional Gaussian prior. We built the
covariance matrix of the prior by multiplying the original covariance matrix of
the parameters (kindly made available by Jeremy Tinker) by a positive constant
so that the halo mass function at $M = 10^{15}\,h^{-1} M_\odot$ is $\sim 10$
per cent uncertain (case II). With respect to case I, this method does not
improve the bias of $\Omega_{\rm m}$ and $\sigma_8$ while the corresponding FoM
decreases by a factor of $\sim 3.6$.
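
One way to picture the covariance inflation of case II is through linear error propagation: if $g$ is the gradient of $\ln n_{\rm h}$ at the chosen mass with respect to the mass-function parameters and $C$ their covariance matrix, the fractional variance of $n_{\rm h}$ is $g^{T} C g$, and multiplying $C$ by a constant $k$ multiplies that variance by $k$. The sketch below solves for $k$ given a target fractional error. The linearization, the function names, and the numerical gradient and covariance are all hypothetical; the Letter does not state how the constant was computed.

```python
import math


def propagated_variance(grad, cov):
    """Fractional variance g^T C g from linear error propagation."""
    n = len(grad)
    return sum(grad[i] * cov[i][j] * grad[j] for i in range(n) for j in range(n))


def inflation_factor(grad, cov, target_rel_error=0.10):
    """Constant k such that the covariance k*C gives the target fractional error."""
    return target_rel_error ** 2 / propagated_variance(grad, cov)


def inflate(cov, factor):
    """Return the covariance matrix multiplied by a positive constant."""
    return [[factor * c for c in row] for row in cov]


if __name__ == "__main__":
    # Hypothetical gradient of ln n_h with respect to four parameters and a
    # hypothetical covariance matrix; neither comes from the Letter.
    grad = [1.0, -0.5, 0.3, 2.0]
    cov = [[1.0e-4, 2.0e-5, 0.0, 0.0],
           [2.0e-5, 1.0e-4, 0.0, 0.0],
           [0.0, 0.0, 1.0e-4, 1.0e-5],
           [0.0, 0.0, 1.0e-5, 1.0e-4]]
    k = inflation_factor(grad, cov)
    print("inflation factor:", round(k, 2))
    print("resulting error:", round(math.sqrt(propagated_variance(grad, inflate(cov, k))), 3))
```

Since the rescaling preserves the correlation structure of the original matrix, it widens the prior along the directions the N-body calibration already allows and cannot by itself introduce the mass-dependent shift produced by a varying baryon fraction.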

In order to eliminate the bias, we release the assumption that
$M_{\rm tot} = M_{\rm nb}$ in Eq. (4) and replace $n_0$ with $n$ to fit
$n_{\rm obs}$ (case III). This way, we simultaneously constrain the
cosmological parameters and the scaling relation
$P(f_{\rm b}|M_{\rm tot})$. This procedure is analogous to the "self
calibration" method proposed to extract cosmological information from cluster
surveys (Hu 2003). We have verified that this approach provides unbiased
estimates of $\Omega_{\rm m}$ and $\sigma_8$. However, the statistical
uncertainties on the values of the cosmological parameters constrained by the
cluster counts are significantly larger than in cases I and II.

The situation markedly improves by considering additional information on
$f_{\rm b}$ extracted from multi-wavelength studies of a small subset of galaxy
clusters. To show this, we randomly select 30 clusters out of the full sample
and imagine that the baryonic fraction of their mass content has been measured
with 10 per cent precision (case IV). We then build new Markov chains assuming
that the information from the measurement of the baryon fraction is independent
of the cluster counts. The resulting estimates for $\Omega_{\rm m}$ and
$\sigma_8$ are unbiased and the errorbars are small, as expected from
experiments for "precision cosmology". In this case, the parameters of the
scaling relation $P(f_{\rm b}|M_{\rm tot})$ are also recovered to good
accuracy, namely (to 68.3 per cent credibility), $A = 0.72 \pm 0.01$,
$B = 0.09 \pm 0.01$ and
$\sigma_{\rm b} = 4.5^{+\ldots}_{-\ldots} \times 10^{-3}$.
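
A minimal sketch of the extra likelihood term contributed by the follow-up subsample is given below. It assumes, beyond what the text states, that each measured baryon fraction is Gaussian-distributed around Eq. (2) with variance $\sigma_{\rm b}^2$ plus the squared measurement error, that the total masses of the follow-up clusters are known exactly, and that the mock masses are drawn uniformly in $\log_{10} M_{\rm tot}$ (a simplification; the Letter draws them from the catalog). All names are hypothetical. Independence of the two data sets means that this term is simply added to the log-likelihood of the counts.

```python
import math
import random

F_C = 0.044 / 0.258   # cosmic baryon fraction
M_PIVOT = 2.0e14      # h^-1 M_sun


def mean_fb(m_tot, a, b):
    """Mean baryon fraction at fixed total mass, Eq. (2)."""
    return a * F_C * (m_tot / M_PIVOT) ** b


def fb_log_likelihood(data, a, b, sigma_b):
    """Gaussian log-likelihood of (m_tot, fb_measured, fb_error) triples."""
    total = 0.0
    for m_tot, fb, err in data:
        var = sigma_b ** 2 + err ** 2   # intrinsic plus measurement variance
        resid = fb - mean_fb(m_tot, a, b)
        total += -0.5 * (resid * resid / var + math.log(2.0 * math.pi * var))
    return total


def mock_fb_sample(n=30, a=0.72, b=0.09, sigma_b=2.0e-3, rel_err=0.10, seed=3):
    """Mock follow-up sample with 10 per cent measurement errors on f_b."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        m_tot = 10.0 ** rng.uniform(13.5, 15.0)   # simplification, see lead-in
        truth = mean_fb(m_tot, a, b) + rng.gauss(0.0, sigma_b)
        err = rel_err * truth
        data.append((m_tot, truth + rng.gauss(0.0, err), err))
    return data


if __name__ == "__main__":
    sample = mock_fb_sample()
    for a in (0.60, 0.72, 0.84):
        print(a, round(fb_log_likelihood(sample, a, 0.09, 2.0e-3), 2))
    # Joint fit: ln L_total = ln L_counts + fb_log_likelihood(sample, a, b, sigma_b)
```

With 30 objects and 10 per cent errors, the normalization $A$ alone is pinned down to roughly $0.1 \times 0.72 / \sqrt{30} \approx 0.013$, comparable to the uncertainty on $A$ quoted in the text, which is why such a small subsample is enough to break the degeneracy with the cosmological parameters.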

Finally, we consider the ideal case in which the scaling relation
$P(f_{\rm b}|M_{\rm tot})$ is perfectly known from independent data (case V,
not shown in Fig. 2). This allows us to conclude that case IV gives
cosmological constraints which are nearly optimal.

To simplify the discussion, so far, we have assumed that the scatter of the
observational mass estimates for the galaxy clusters,
$\sigma_{\ln M_{\rm obs}|M_{\rm nb}}$, is perfectly known. This, however, does
not accurately reflect reality, where one has only limited knowledge of the
actual size of the measurement error, which should then be treated as a
nuisance parameter in the model-fitting procedure (e.g. Cunha & Evrard 2009).
This will generally broaden the credibility region of the cosmological
parameters and, in principle, might reduce the statistical significance of the
biased estimates obtained with



Case       $\Omega_{\rm m}$     $\sigma_8$           FoM
Fiducial   0.258                0.773                -
I          0.245 ± 0.002        0.780 ± 0.007        $2.4 \times 10^4$
II         0.244 ± 0.003        0.780 ± 0.007        $6.7 \times 10^3$
III        0.259 ± 0.005        0.782 ± 0.012        $2.8 \times 10^3$
IV         0.257 ± 0.003        0.773 ± 0.007        $2.1 \times 10^4$
V          0.258 ± 0.003        0.772 ± 0.007        $2.5 \times 10^4$

Table 2. Mean and rms value of the marginalized posterior distribution for
$\Omega_{\rm m}$ and $\sigma_8$. The third column shows the figure of merit,
defined as the inverse of the area of the joint 68.3 per cent credibility
region.

model $n_0$. In order to evaluate the impact of this subtlety on our results,
we have repeated the measurement of the cosmological parameters (for cases I
and IV) after marginalizing over $\sigma_{\ln M_{\rm obs}|M_{\rm nb}}$. We have
considered two cases. First, as a realistic option, we have adopted a Gaussian
prior on $\sigma_{\ln M_{\rm obs}|M_{\rm nb}}$ with mean 0.1 and rms error 0.02
(cases I_G and IV_G); note that a 20 per cent standard error of the standard
deviation corresponds to a sample of 14 objects. As a (very) pessimistic
option, instead, we have considered a flat prior over
$0 < \sigma_{\ln M_{\rm obs}|M_{\rm nb}} < 1$ (cases I_F and IV_F). The
corresponding joint posterior distributions for $\Omega_{\rm m}$ and
$\sigma_8$ are shown in panel (b) of Fig. 2. In all cases, the bias in the
estimates based on the model $n_0$ is still evident.
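
The remark about 14 objects follows from the standard large-sample approximation for Gaussian data, in which the fractional standard error of a sample standard deviation estimated from $N$ objects is about $1/\sqrt{2(N - 1)}$. The short sketch below, with hypothetical function names, inverts this relation.

```python
import math


def relative_error_of_std(n):
    """Approximate fractional standard error of a sample standard
    deviation estimated from n Gaussian-distributed objects."""
    return 1.0 / math.sqrt(2.0 * (n - 1))


def sample_size_for(rel_error):
    """Smallest n whose standard deviation is known to within rel_error."""
    return math.ceil(1.0 + 1.0 / (2.0 * rel_error ** 2))


if __name__ == "__main__":
    print(sample_size_for(0.2), round(relative_error_of_std(14), 3))
```

A prior of width 0.02 around a scatter of 0.1 is therefore equivalent to calibrating the mass proxy on roughly 14 clusters with independent mass measurements.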



4 DISCUSSION AND CONCLUSIONS 

The fractional baryon content of galaxy clusters, $f_{\rm b}$, is a random
variable which, on average, increases with the cluster mass and lies below the
cosmic value. Therefore the cluster mass function must differ from the
predictions of N-body simulations, where the baryon fraction is implicitly held
constant at the cosmic value.

Considering a prototypical catalog of galaxy clusters of the forthcoming
generation containing $2.79 \times 10^4$ entries, we have shown that
constraints on the cosmological parameters $\Omega_{\rm m}$ and $\sigma_8$
derived from the cluster mass function would be severely biased if this signal
is modelled with fitting formulae based on N-body simulations of collisionless
matter. In particular, $\Omega_{\rm m}$ would always be biased low while the
bias on $\sigma_8$ depends on the details of the scaling relation between
$f_{\rm b}$ and the cluster mass.

The widespread technique of artificially inflating the co- 
variance matrix of the parameters that describe the halo 
mass function to gain freedom and minimize systematic effects
will not be of much help. Our study shows that it would
enlarge the statistical uncertainties on the cosmological pa- 
rameters without eliminating (or reducing) the bias. 

In order to obtain accurate estimates for the cosmological parameters,
complementary information on the fraction of baryons as a function of the total
mass is required. The optimal method to eliminate this systematic effect
requires two ingredients: I) an accurate model for the conditional probability
density of finding a particular value for $f_{\rm b}$ given the cluster total
mass, $P(f_{\rm b}|M_{\rm tot})$; II) a small, random subsample of clusters
with follow-up data for which simultaneous measurements of $f_{\rm b}$ and
$M_{\rm tot}$ can be made. We have shown that, if the scatter around the mean
$f_{\rm b}$-$M_{\rm tot}$ relation, $\sigma_{\rm b}$, is independent of mass,
about 30 objects would be enough

Figure 2. (a) The 95.4 per cent credibility region in the plane
$\{\Omega_{\rm m}, \sigma_8\}$ for the cases discussed in Table 1. A cross
indicates the input values for the cosmological parameters. (b) As in panel (a)
but treating the measurement error $\sigma_{\ln M_{\rm obs}|M_{\rm tot}}$ as a
nuisance parameter (contrary to panel (a) where the log-scatter is assumed to
be known exactly).



for precision cosmology. The required size of the subsample
would need to be larger if $\sigma_{\rm b}$ has a strong mass dependence.

To facilitate understanding, our simple analysis consid- 
ers a complete sample in a narrow redshift bin and only two 
cosmological parameters but our results are of general value. 
It is straightforward to generalize them including more pa- 
rameters and accounting for possible evolutionary effects in 
the baryon fraction along the past light cone and for the 
radial selection function of a realistic survey. This, however, 
is beyond the scope of this Letter. 

We conclude that considering the effect of baryons on the cluster mass function
is central to extracting unbiased estimates of the cosmological parameters from
forthcoming large-volume surveys such as DES (Abbott et al. 2005), eROSITA
(Predehl et al. 2010; Pillepich et al. 2011), ASKAP-EMU (Norris et al. 2009),
CCAT (Radford et al